PRECIS: An Automated Pipeline for Producing Concise Reports About Proteins
Abstract
There have been several attempts to annotate sequence data computationally. Annotation generation can be considered a pipeline of processes: first harvesting data from a variety of sources, then distilling and transforming it into a form more appropriate for the end database. This task is usually performed by human annotators, a solution that is clearly not scalable. Previous attempts to mimic some of these pipelines in software have generally focused on low-level annotation, such as database cross-references, or on harvesting data from computational techniques such as gene finding or similarity searches. Higher-level annotation, such as that seen in the PRINTS database, is usually formed from data that is free text or only partly structured, which presents a much greater computational challenge. We therefore studied the pipeline used to generate annotation for the PRINTS database and developed prototype software that reflects and automates it. As this software operates primarily on data culled from the SWISS-PROT database, we have called it PRECIS (Protein Reports Engineered from Concise Information in SWISS-PROT). It is currently being used to generate annotation for the prePRINTS database. Because the output is a structured report detailing the function, structure and disease associations of a protein, together with literature references and keywords, we believe it will be of more general use. The software is available on request from [email protected].
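The harvest-then-distil pipeline described in the abstract can be illustrated with a minimal sketch. This is not the PRECIS implementation, which the abstract does not detail. It assumes SWISS-PROT flat-file style input, with `CC   -!- TOPIC:` comment lines and `KW` keyword lines. The two entries, their identifiers, and the function names `harvest` and `distil` are hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical miniature entries in SWISS-PROT flat-file style (invented data).
ENTRIES = [
    """ID   EXAMPLE1_HUMAN
CC   -!- FUNCTION: Binds ligand X.
CC   -!- DISEASE: Defects cause condition Y.
KW   Receptor; Transmembrane.
""",
    """ID   EXAMPLE2_MOUSE
CC   -!- FUNCTION: Binds ligand X.
CC   -!- SUBUNIT: Homodimer.
KW   Receptor; Glycoprotein.
""",
]


def harvest(entry):
    """Harvest step: collect comment topics and keywords from one entry."""
    topics = defaultdict(list)
    keywords = []
    current = None
    for line in entry.splitlines():
        code, text = line[:2], line[5:].strip()
        if code == "CC":
            if text.startswith("-!-"):
                # A new comment block such as "-!- FUNCTION: ..."
                topic, _, body = text[3:].partition(":")
                current = topic.strip()
                topics[current].append(body.strip())
            elif current is not None:
                # Continuation line of the current comment block
                topics[current][-1] += " " + text
        elif code == "KW":
            keywords += [k.strip() for k in text.rstrip(".").split(";") if k.strip()]
    return topics, keywords


def distil(entries, wanted=("FUNCTION", "DISEASE")):
    """Distil step: merge the wanted topics and keywords, dropping duplicates."""
    report = {topic: [] for topic in wanted}
    report["KEYWORDS"] = []
    for entry in entries:
        topics, keywords = harvest(entry)
        for topic in wanted:
            for sentence in topics.get(topic, []):
                if sentence not in report[topic]:
                    report[topic].append(sentence)
        for keyword in keywords:
            if keyword not in report["KEYWORDS"]:
                report["KEYWORDS"].append(keyword)
    return report


if __name__ == "__main__":
    for section, items in distil(ENTRIES).items():
        print(section + ":", "; ".join(items))
```

The sketch only removes exact duplicate sentences. Condensing free-text comments that say the same thing in different words is the harder problem the abstract alludes to, and it is not attempted here.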
Similar resources
PRECIS: Protein reports engineered from concise information in SWISS-PROT
MOTIVATION There have been several endeavours to address the problem of annotating sequence data computationally, but the task is non-trivial and few tools have emerged that gather useful information on a given sequence, or set of sequences, in a simple and convenient manner. As more genome projects bear fruit, the mass of uncharacterized sequence data accumulating in public repositories grows ...
Concerning the Effect of a Viscoelastic Foundation on the Dynamic Stability of a Pipeline System Conveying an Incompressible Fluid
In this paper, we present an analytical method for solving a well-posed boundary value problem of mathematical physics governing the vibration characteristics of an internal flow propelled fluid-structure interaction where the pipeline segment is idealized as an elastic hollow beam conveying an incompressible fluid on a viscoelastic foundation. The effect of Coriolis and damping forces on the o...
Eukaryotic Genome Annotation Pipeline
The NCBI Eukaryotic Genome Annotation Pipeline is an automated pipeline producing annotation of coding and non-coding genes, transcripts, and proteins on finished and unfinished public genome assemblies. It provides content for various NCBI resources including Nucleotide, Protein, BLAST, Gene, and the Map Viewer genome browser. The pipeline uses a modular framework for the execution of all ann...
A computational pipeline to generate MHC binding motifs
BACKGROUND Major histocompatibility complex (MHC) class I molecules play key roles in host immunity against pathogens by presenting peptide antigens to CD8+ T-cells. Many variants of MHC molecules exist, and each has a unique preference for certain peptide ligands. Both experimental approaches and computational algorithms have been utilized to analyze these peptide MHC binding characteristics. ...
The application of PRECIS-2 ratings in randomized controlled trials of Chinese herbal medicine
This study tests the feasibility of applying the pragmatic-explanatory continuum indicator summary (version "PRECIS-2") tool to randomized controlled trials of Chinese herbal medicine. A search was conducted to identify potentially eligible randomized controlled trials. Using the PRECIS-2 tool, assessment of trials was performed independently by 2 evaluators using a scale of 1-5 for each criter...
Publication date: 2001